Cries and Whispers - Classification of Vocal Effort in Expressive Speech
Abstract
The expansion of the video games industry raises innovative and challenging issues for speech technologies, e.g., the development of automatic content-based speech processing and speech recognition systems for video game postproduction and voice casting. This paper presents a large-scale study on the classification of vocal effort in expressive speech for video games. Changes in vocal effort lead to substantial modifications in the configuration of the voice production mechanisms. In particular, registers of vocal effort affect voice quality, which reflects qualitative modifications of the source excitation characteristics. This study introduces robust source characteristics to measure various types of voice quality (e.g., breathy, creaky, tense) for the classification of vocal effort into whispered, normal, and shouted speech. The system is evaluated in the real scenario of video game production, using the complete speech recordings of a massive role-playing video game. The proposed features significantly improve classification performance over conventional MFCCs, from 81.1% to 87%. These results confirm the role of the source and voice quality in describing changes in vocal effort.
Index Terms: speech recognition, vocal effort, voice quality, glottal source, GMM-UBM/SVM.
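As a rough illustration of the three-way decision described in the abstract (whispered, normal, shouted), the sketch below fits one diagonal Gaussian per class and labels an utterance by the highest summed frame log-likelihood. This is a deliberately simplified stand-in and not the paper's GMM-UBM/SVM system: the two-dimensional features, the class centres, and all function names are hypothetical, and the data are synthetic.

```python
import math
import random

CLASSES = ("whispered", "normal", "shouted")


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    dims = len(frames[0])
    n = len(frames)
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)  # variance floor
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(frame, model):
    """Log-density of one frame under a diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def classify(frames, models):
    """Return the class whose model scores the whole utterance highest."""
    scores = {
        label: sum(log_likelihood(f, model) for f in frames)
        for label, model in models.items()
    }
    return max(scores, key=scores.get)


def toy_frames(rng, centre, n=200):
    """Draw synthetic frames around a class centre (unit standard deviation)."""
    return [[rng.gauss(c, 1.0) for c in centre] for _ in range(n)]


rng = random.Random(0)
# Hypothetical 2-D features (a loudness-like and a spectral-tilt-like value).
centres = {"whispered": (-3.0, 2.0), "normal": (0.0, 0.0), "shouted": (3.0, -2.0)}
models = {label: fit_diag_gaussian(toy_frames(rng, centres[label])) for label in CLASSES}

test_utterance = toy_frames(rng, centres["shouted"], n=50)
predicted = classify(test_utterance, models)
```

In a real system the synthetic frames would be replaced by MFCCs or source and voice-quality features extracted from recordings, and the single Gaussians by the richer models named in the index terms.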
Similar resources
Acoustic correlates for perceived effort levels in expressive speech
Actors and other vocal performers vary their speech across the continuum of vocal effort to express ideas, emphasize thoughts, communicate emotions, and create drama. They are experts at vocal expression. To analyze this range of expression across effort levels, we curated a corpus of professional actors’ Hamlet soliloquy performances and present an acoustic feature set and classification model...
Reconstruction of continuous voiced speech from whispers
Whispers are an important secondary vocal communications mechanism that can be necessary for communicating private information and which are an integral aspect of natural human-to-human dialogue. Furthermore, they may be the primary vocal communications method of those suffering from certain forms of aphonia, such as laryngectomees. This paper considers the conversion of continuous whispers to n...
Towards Annotation of Nonverbal Vocal Gestures in Slovak
The paper presents some of the problems of classification and annotation of speech sounds that have their own phonetic content, phonological function, and prosody, but they do not have an adequate linguistic (or text) representation. One of the most important facts about these "nonverbal vocal gestures" is that they often have a rich semantic content and they play an important role in expressiv...
A comprehensive vowel space for whispered speech
Whispered speech is a relatively common form of communications, used primarily to selectively exclude or include potential listeners from hearing a spoken message. Despite the everyday nature of whispering, and its undoubted usefulness in vocal communications, whispers have received relatively little research effort to date, apart from some studies analyzing the main whispered vowels and some q...
Parametric model for vocal effort interpolation with harmonics plus noise models
It is known that voice quality plays an important role in expressive speech. In this paper, we present a methodology for modifying vocal effort level, which can be applied by text-to-speech (TTS) systems to provide the flexibility needed to improve the naturalness of synthesized speech. This extends previous work using low order Linear Prediction Coefficients (LPC) where the flexibility was con...